ABSTRACT

To determine whether and how common polymorphisms are associated with natural distributions of 
iris colors, we surveyed 851 individuals of mainly European descent at 335 SNP loci in 13 pigmentation 
genes and 419 other SNPs distributed throughout the genome and known or thought to be informative 
for certain elements of population structure. We identified numerous SNPs, haplotypes, and diplotypes 
(diploid pairs of haplotypes) within the OCA2, MYO5A, TYRP1, AIM, DCT, and TYR genes and the 
CYP1A2-15q22-ter, CYP1B1-2p21, CYP2C8-10q23, CYP2C9-10q24, and MAOA-Xp11.4 regions as significantly 
associated with iris colors. Half of the associated SNPs were located on chromosome 15, which corresponds 
with results that others have previously obtained from linkage analysis. We identified 5 additional genes 
(ASIP, MC1R, POMC, and SILV) and one additional region (GSTT2-22q11.23) with haplotype and/or 
diplotypes, but not individual SNP alleles associated with iris colors. For most of the genes, multilocus 
gene-wise genotype sequences were more strongly associated with iris colors than were haplotypes or SNP 
alleles. Diplotypes for these genes explain 15% of iris color variation. Apart from representing the first 
comprehensive candidate gene study for variable iris pigmentation and constituting a first step toward 
developing a classification model for the inference of iris color from DNA, our results suggest that cryptic 
population structure might serve as a leverage tool for complex trait gene mapping if genomes are screened 
with the appropriate ancestry informative markers. 



Iris pigmentation is a complex genetic trait that has 
long interested geneticists, anthropologists, and the 
public at large. However, it is yet to be completely under- 
stood. Eumelanin (brown pigment) is a light-absorbing 
polymer synthesized in specialized melanocyte lyso- 
somes called melanosomes. Within the melanosomes, 
the tyrosinase (TYR) gene product catalyzes the rate-limiting 
hydroxylation of tyrosine to 3,4-dihydroxyphenylalanine 
(DOPA), and the resulting product is oxidized 
to DOPAquinone to form the precursor for eumelanin 
synthesis. Although TYR is centrally important for this 
process, pigmentation in animals is not simply a Mende- 
lian function of TYR or of any other single protein 
product or gene sequence. In fact, study of the transmis- 
sion genetics for pigmentation traits in humans and 
various model systems suggests that variable pigmenta- 
tion is a function of multiple heritable factors whose 
interactions appear to be quite complex (Brauer and 
Chopra 1978; Bito et al. 1997; Box et al. 1997, 2001; 
Akey et al. 2001; Sturm et al. 2001). For example, unlike 
human hair color (Sturm et al. 2001), there appears to 
be only a minor dominance component for mammalian 
iris color determination (Brauer and Chopra 1978), 
and minimal correlation exists among skin, hair, and 
iris color within or between individuals of a given popu- 
lation. In contrast, between-population comparisons 
show good concordance; populations with darker aver- 
age iris color also tend to exhibit darker average skin 
tones and hair colors. These observations suggest that 
the genetic determinants for pigmentation in the vari- 
ous tissues are distinct and that these determinants have 
been subject to a common set of systematic and evolu- 
tionary forces that have shaped their distribution in 
world populations. 

At the cellular level, variable iris color in healthy hu- 
mans is the result of the differential deposition of mela- 
nin pigment granules within a fixed number of stromal 
melanocytes in the iris (Imesch et al. 1997). The density 
of granules appears to reach genetically determined 
levels by early childhood and usually remains constant 
throughout later life, although a small minority of indi- 
viduals exhibit changes in color during later stages of 
life (Bito et al. 1997). Pedigree studies in the mid-1970s 
suggested that iris color variation is a function of two 
loci: a single locus responsible for depigmentation of 
the iris, not affecting skin or hair, and another pleiotro- 
pic gene for reduction of pigment in all tissues (Brues 
1975). Most of what we have learned about pigmenta- 
tion since has been derived from molecular genetics 
studies of rare pigmentation defects in humans and 
model systems such as mouse and Drosophila. For example, 
dissection of the oculocutaneous albinism (OCA) 
trait in humans has shown that many pigmentation defects 
are due to lesions in the TYR gene, resulting in 
their designation as TYR-negative OCAs (Oetting and 
King 1991, 1992, 1993, 1999; see albinism database at 
http://www.cbc.umn.edu/tad/). TYR catalyzes the rate- 
limiting step of melanin biosynthesis and the degree to 
which human irises are pigmented correlates well with 
the amplitude of TYR message levels (Lindsey et al. 
2001). Nonetheless, the complexity of OCA phenotypes 
illustrates that TYR is not the only gene involved in iris 
pigmentation (Lee et al. 1994). Although most TYR- 
negative OCA patients are completely depigmented, 
dark-iris albino mice (C44H) and their human type IB 
oculocutaneous counterparts exhibit a lack of pigment 
in all tissues except for the iris (Schmidt and Beermann 
1994). Study of a number of other TYR-positive OCA 
phenotypes has shown that, in addition to TYR, the 
oculocutaneous 2 (OCA2; Hamabe et al. 1991; Gardner 
et al. 1992; Durham-Pierre et al. 1994, 1996), tyrosinase-like 
protein (TYRP1; Abbott et al. 1991; Chintamaneni 
et al. 1991; Boissy et al. 1996), melanocortin receptor 
(MC1R; Robbins et al. 1993; Smith et al. 1998; Flanagan 
et al. 2000), and adaptin 3B (AP3B) loci (Ooi et al. 
1997), and other genes (reviewed by Sturm et al. 2001) 
are necessary for normal human iris pigmentation. Each 
of these genes is part of the main (TYR) human pigmentation 
pathway. In Drosophila, iris pigmentation defects 
have been ascribed to mutations in >85 loci contribut- 
ing to a variety of cellular processes in melanocytes (Ooi 
et al. 1997; Lloyd et al. 1998), but mouse studies have 
suggested that ~14 genes preferentially affect pigmenta- 
tion in vertebrates (reviewed in Sturm et al. 2001) and 
that disparate regions of the TYR and other OCA genes 
are functionally distinct for determining the pigmenta- 
tion in different tissues. Human pigmentation genes 
break out into several biochemical pathways, including 
those for tyrosinase enzyme complex formation on the 
inner surface of the melanosome, hormonal and envi- 
ronmental regulation, melanoblast migration and dif- 
ferentiation, the intracellular routing of new proteins 
into the melanosome, and the proper transportation of 
the melanosomes from the body of the cell into the 
dendritic arms toward the keratinocytes. Nonetheless, 
the study of human OCA mutants suggests that the 
number of highly penetrant phenotypically active pig- 
mentation loci is surprisingly small. 

Although research on pigment mutants has made 
clear that a small subset of genes is largely responsible 
for catastrophic pigmentation defects in mice and hu- 
mans, it remains unclear whether or how common sin- 
gle-nucleotide polymorphisms (SNPs) in these genes 
contribute toward (or are linked to) natural variation 
in human iris color. A brown-iris locus was localized 
to an interval containing the OCA2 and MYO5A genes 
(Eiberg and Mohr 1996), and specific polymorphisms 



in the MC1R gene have been shown to be associated 
with red hair and blue iris color in relatively isolated 
populations (Robbins et al. 1993; Valverde et al. 1995; 
Koppula et al. 1997; Smith et al. 1998; Schioth et al. 
1999; Flanagan et al. 2000). An ASIP polymorphism is 
reported to be associated with both brown iris and hair 
color (Kanetsky et al. 2002). However, the penetrance 
of each of these alleles appears to be low and, in general, 
they appear to explain but a very small amount of the 
overall variation in iris colors within the human popula- 
tion (Spritz 1995). However, single-gene studies have 
not provided a sound basis for understanding the com- 
plex genetics of human iris color. Because most human 
traits have complex genetic origins, wherein the whole 
is often greater than the sum of its parts, innovative 
genomics-based study designs and analytical methods 
for screening genetic data in silico that are respectful of 
genetic complexity are needed, for example, to address the 
multifactorial and/or phase-known components of dominance 
and epistatic genetic variance. The first step, however, 
is to define the complement of loci that on a 
sequence level explain variance in trait value and, of 
these, those that do so in a marginal or penetrant sense 
will be the easiest to find. It is toward this goal that we 
have performed the present study. 

We have applied a nonsystematic, hypothesis-driven 
genome-screening approach to identify various SNPs, 
haplotypes, and diplotypes marginally (i.e., indepen- 
dently) associated with iris color variation. Our results 
show that a surprisingly large number of polymorphisms 
in a large number of genes are associated with iris colors, 
suggesting that the genetics of iris color pigmentation 
are quite complex. The sequences we have identified 
constitute a good first step toward developing a classifier 
model for the inference of iris colors from DNA, and 
the nature of some of these as markers of population 
structure might have implications for the design of other 
complex trait gene-mapping studies. 

MATERIALS AND METHODS 

Specimens: Specimens for resequencing were obtained 
from the Coriell Institute in Camden, New Jersey. Specimens 
for genotyping were of self-reported European descent, of 
different age, sex, hair, iris, and skin shades and they were 
collected using informed consent guidelines under Investiga- 
tional Review Board guidance. Donors checked a box for blue, 
green, hazel, brown, black, or unknown/not clear iris colors, 
and each had the opportunity to identify whether iris color 
had changed over the course of their lives or whether the 
color of each iris was different. Individuals for whom iris color 
was ambiguous or had changed over the course of life were 
eliminated from the analysis. In addition, for 103 of the subjects, 
iris colors were reported using a number from 1 to 11 
as well, where 1 is the darkest brown/black and 11 is the 
lightest blue, identified using a color placard. For these subjects, 
we obtained digital photographs of the right iris; 
subjects peered into one end of a box toward a camera at the 
other end, which standardized lighting conditions and distance, 
and a judge assigned each photograph to a color group. 
Comparing the results of the two methods of classification, 
86 of the classifications matched. Of the 17 that did not, 6 
were brown/hazel, 7 were green/hazel, and 4 were blue/ 
green discrepancies although none were gross discrepancies 
such as brown/green, brown/blue, or hazel/blue. Although 
such an error is tolerable for identifying sequences marginally 
associated with iris colors, the use of the sequences described 
herein for iris color classification would therefore likely re- 
quire digitally quantified iris colors (which we have begun to 
accumulate and will present elsewhere). 
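
The concordance between self-reported and photograph-based classification can be checked with a few lines of arithmetic. The following is a minimal Python sketch using only the counts quoted above; the variable names are illustrative and are not part of the study's software.

```python
# Counts quoted in the text for the 103 subjects classified by both methods.
total_subjects = 103
matched = 86

# Mismatches, keyed by the pair of neighboring color classes involved.
mismatches = {"brown/hazel": 6, "green/hazel": 7, "blue/green": 4}

n_mismatched = sum(mismatches.values())

# The matched and mismatched counts should account for every subject.
assert matched + n_mismatched == total_subjects

concordance = matched / total_subjects
print(f"mismatched: {n_mismatched}, concordance: {concordance:.1%}")
```

Under these counts, roughly five of every six self-reports agree with the judged photograph, and every disagreement involves adjacent color classes.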

SNP discovery: We obtained candidate SNPs from the Na- 
tional Center for Biotechnology Information (NCBI) Single 
Nucleotide Polymorphism Database (dbSNP), which generally 
provided more candidate SNPs than were possible to geno- 
type. We focused on human pigmentation and xenobiotic 
metabolism genes, selected on the basis of their gene identi- 
ties, not their chromosomal position. For some genes, the 
number of SNPs in the database was low and/or some of the 
SNPs were strongly associated with iris colors, warranting a 
deeper investigation. For these genes we performed resequencing; 
among the genes discussed in this article, 113 SNPs 
were discovered in CYP1A2 (7 gene regions, 5 amplicons, 10 
SNPs found), CYP2C8 (9 gene regions, 8 amplicons, 15 SNPs 
found), CYP2C9 (9 gene regions, 8 amplicons, 24 SNPs found), 
OCA2 (16 gene regions, 15 amplicons, 40 SNPs found), TYR 
(5 gene regions, 5 amplicons, 10 SNPs found), and TYRP1 (7 
gene regions, 6 amplicons, 14 SNPs found). Resequencing 
for these genes was performed by amplifying the proximal 
promoter (average 700 bp upstream of transcription start site), 
each exon (average size 1400 bp), the 5' and 3' ends of each 
intron (including the intron-exon junctions, average size ~100 
bp), and 3' untranslated region (UTR; average size 700 bp) 
sequences from a multi-ethnic panel of 672 individuals (450 
individuals from the Coriell Institute's DNA Polymorphism 
Discovery Resource, 96 additional European Americans, 96 
African Americans, 10 Pacific Islanders, 10 Japanese, and 10 
Chinese; these 672 individuals represented a set of samples 
separate from that used for the association study described 
herein). PCR amplification was accomplished using Pfu Turbo 
polymerase according to the manufacturer's guidelines (Stratagene, 
La Jolla, CA). We developed a program (T. Frudakis, 
M. Thomas, Z. Gaskin, K. Venkateswarlu, K. Suresh Chandra, 
S. Ginjupalli, S. Gunturi, S. Natrajan, V. K. Ponnuswamy 
and K. N. Ponnuswamy, unpublished results) to design 
resequencing primers in a manner respectful of homologous 
sequences in the genome, to ensure that we did not coamplify 
pseudogenes or amplify from within repeats. BLAST searches 
confirmed the specificity of all primers used. Amplification 
products were subcloned into the pTOPO (Invitrogen, San 
Diego) sequencing vector and 96 insert-positive colonies were 
grown for plasmid DNA isolation (the use of 670 individuals 
for the amplification step reduced the likelihood of an individ- 
ual contributing more than once to this subset of 96 selected). 
We sequenced with an ABI3700 using PE Applied Biosystems 
BDT chemistry and we deposited the sequences into a commercial 
relational database system (iFINCH, Geospiza, Seattle). 
PHRED-qualified sequences were imported into the 
CLUSTAL X alignment program and the output of this was 
used with a second program that we developed (T. Frudakis, 
M. Thomas, Z. Gaskin, K. Venkateswarlu, K. Suresh Chandra, 
S. Ginjupalli, S. Gunturi, S. Natrajan, V. K. Ponnuswamy 
and K. N. Ponnuswamy, unpublished results) to identify 
quality-validated discrepancies between sequences. We 
selected those supported by at least two PHRED-identified 
variant instances that scored ≥24, and each of the SNPs 
discovered through resequencing was used for genotyping. 

Genotyping: For most of the SNPs, a first round of PCR 
was performed on the samples using the high-fidelity DNA 



polymerase Pfu Turbo and the appropriate resequencing primers. 
Representatives of the resulting PCR products were 
checked on an agarose gel, and first-round PCR product was 
diluted and then used as template for a second round of PCR. 
The two rounds were necessary because many of 
the genes we queried were members of gene families, the SNPs 
resided in regions of sequence homology, and our genotyping 
platform required short (~100 bp) amplicons. For those re- 
maining, only a single round of PCR was performed. Genotyp- 
ing was performed for individual DNA specimens using a 
single base primer extension protocol and an SNPstream 25K/ 
ultra-high throughput (UHT) instrument (Beckman Coulter, 
Fullerton, CA, and Orchid Biosystems, Princeton, NJ). Geno- 
types were subject to several quality controls: two scientists 
independently pass/fail inspected the calls, requiring an over- 
all UHT signal intensity >1000 for >95% of genotypes and 
clear signal differential between the averages for each geno- 
type class (i.e., clear genotype clustering in two-dimensional 
space using the UHT analysis software). 
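
The signal-intensity criterion above can be expressed as a simple filter. The following Python sketch is illustrative only: the function name, the argument layout, and the treatment of an empty plate are assumptions, and the second criterion (clear genotype clustering, judged by inspection) is not modeled.

```python
def passes_intensity_qc(intensities, min_signal=1000.0, min_fraction=0.95):
    """Return True if more than min_fraction of calls exceed min_signal.

    intensities: per-genotype overall signal intensities for one SNP assay.
    The thresholds default to the values quoted in the text (>1000, >95%).
    """
    if not intensities:
        # Assumption: an assay with no calls cannot pass.
        return False
    n_ok = sum(1 for value in intensities if value > min_signal)
    return n_ok / len(intensities) > min_fraction


# Hypothetical plate: 96 strong calls and 4 weak calls out of 100.
example = [1500.0] * 96 + [500.0] * 4
print(passes_intensity_qc(example))
```

Both comparisons are strict, matching the ">" signs in the text, so an assay with exactly 95% of calls above the threshold would fail.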

Statistical methods: To test the departures from indepen- 
dence in allelic state within and between loci, we used the 
exact test, described in Zaykin et al. (1995). Haplotypes were 
inferred using the Stephens et al. (2001) haplotype recon- 
struction method. To determine the extent to which extant 
iris color variation could be explained by various models, 
we calculated $R^2$ values for SNPs, haplotypes, and multilocus 
genotype data by first assigning the phenotypic value for blue 
eye color as 1, green eye color as 2, hazel eye color as 3, and 
brown eye color as 4. Biogeographical ancestry admixture 
proportions were determined using the methods of Hanis et 
al. (1986) and Shriver et al. (2003) within the context of a 
software program we developed for this purpose, which will 
be presented elsewhere (T. Frudakis, Z. Gaskin, M. Thomas, 
V. Ponnuswamy, K. Venkateswarlu, S. Gunjupulli, C. Boni- 
lla, E. Parra and M. Shriver, personal communication). For 
$R^2$ computation, we used the following function: $\text{Adj-}R^2 = 1 - [n/(n - p)](1 - R^2)$, 
where $n$ is the model degrees of freedom 
and $n - p$ is the error degrees of freedom. To correct for 
multiple tests, we used the empirical Bayes adjustments for 
multiple results method described by Steenland et al. (2000). 
Linkage disequilibrium (LD) for pairs of SNPs within a gene 
was determined using the Zaykin exact test and a cutoff value 
of |D'| > 0.05 (P value < 0.05; Zaykin et al. 1995). 
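
The phenotype coding and the adjusted R-squared function described above can be sketched in Python as follows. This is a hedged illustration, not the software used in the study: the function names are hypothetical, the simple least-squares regression of color score on minor-allele count is an assumption about how a single-SNP R-squared might be obtained, and the data are invented.

```python
# Phenotype coding stated in the text.
COLOR_SCORE = {"blue": 1, "green": 2, "hazel": 3, "brown": 4}


def r_squared(x, y):
    """R^2 of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so nothing can be explained
    return (sxy * sxy) / (sxx * syy)


def adjusted_r_squared(r2, n, p):
    """Adj-R^2 = 1 - [n / (n - p)] * (1 - R^2), as written in the text."""
    return 1.0 - (n / (n - p)) * (1.0 - r2)


# Invented example: minor-allele counts (0, 1, 2) and reported iris colors.
genotypes = [0, 0, 1, 1, 2, 2]
colors = ["blue", "blue", "green", "hazel", "brown", "brown"]
scores = [COLOR_SCORE[c] for c in colors]

r2 = r_squared(genotypes, scores)
adj = adjusted_r_squared(r2, n=len(scores), p=1)
print(r2, adj)
```

The adjustment always lowers R-squared when p > 0, which penalizes models (such as multilocus genotypes) that fit many parameters relative to the number of observations.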



RESULTS 

To identify SNP loci associated with variable human 
pigmentation, we genotyped for 754 SNPs: 335 SNPs 
within pigmentation genes (AP3B1, ASIP, DCT, MC1R, 
OCA2, SILV, TYR, TYRP1, MYO5A, POMC, AIM, AP3D1, 
and RAB; Table 1), and 419 other SNPs distributed 
throughout the genome. Alleles for these latter SNPs 
were known to be informative for certain elements of 
population structure; 73 were selected from a screen 
of the human genome because they were exceptional 
ancestry informative markers (AIMs, based on high δ 
values) for Indo-European, sub-Saharan African, Native 
American, and East Asian biogeographical ancestry 
(BGA; Shriver et al. 2003; T. Frudakis, Z. Gaskin, 
M. Thomas, V. Ponnuswamy, K. Venkateswarlu, S. 
Gunjupulli, C. Bonilla, E. Parra and M. Shriver, 
unpublished observations). The rest were found in or 
around xenobiotic metabolism genes, which we have 
previously shown exhibit dramatic sequence variation 

TABLE 1

... associations with human iris pigmentation

Gene | Name | Homology/model phenotype^a | References
AP3B1 | Adaptor-related protein complex 3, β-1 subunit | Mouse "pearl", human HPS2 | Balkema et al. (1983)
ASIP | Agouti signaling protein | Mouse "agouti" | Kwon et al. (1994); Klebig et al. (1995)
DCT | Dopachrome tautomerase | TYR-related protein 2, mouse "slaty" | Kwon (1993); Jackson et al. (1992)
MC1R | Melanocortin 1 receptor | Mouse "extension" (e) | Robbins et al. (1993)
OCA2 | Oculocutaneous albinism II | Mouse "pink-eyed dilution" (p) | Lyon et al. (1992); Gardner et al. (1992)
SILV | Silver homologue | Mouse "silver" (si) | Kwon et al. (1991)
TYR | Tyrosinase | Mouse "albino" (c), Himalayan | Barton et al. (1988); King et al. (1989); Kwon et al. (1989)
TYRP1 | Tyrosinase related protein 1 | Mouse "brown" (b) | Jackson (1988)
MYO5A | Myosin VA | Mouse "dilute" (d) | Copeland et al. (1983); Strobel et al. (1990)
POMC | Proopiomelanocortin | Mouse Pomc1 | Krude et al. (1998)
AIM (MATP or AIM-1) | Membrane associated transporter protein | Mouse "underwhite" (uw) | Newton et al. (2001)
AP3D1 | Adaptor-related protein complex 3, δ-1 subunit | Mouse "mocha" (mh) |
RAB | RAB27A oncogene | Mouse "ashen" (ash) | Wilson et al. (2000)

^a Name of mutant is in parentheses.

as a function of BGA (Frudakis et al. 2003). Genotypes 
for these 754 candidate SNPs were scored for 851 Euro- 
pean-derived individuals of self-reported iris colors (292 
blue, 100 green, 186 hazel, and 273 brown). Before 
screening these genotypes for association with iris col- 
ors, we used the 73 nonxenobiotic metabolism AIMs to 
determine BGA admixture proportions for each sample 
and we tested for correlation between BGA admixture 
and iris colors. This test showed that each of our 851 
Caucasian samples was of majority Indo-European BGA, 
and although 58% of the samples were of significant 
(>4%) non-Indo-European BGA admixture, there was 
no correlation among low levels (<33%) of East Asian, 
sub-Saharan African, or Native American admixture and 
iris colors. For more extensively admixed individuals, 
we observed no correlation between higher levels 
(>33% but <50%) of Native American admixture and 
iris colors, although there was a weak association be- 
tween higher levels of East Asian and sub-Saharan Afri- 
can admixture and darker iris colors (data not shown). 

It was unclear from the outset whether we would have 
better success considering iris color in terms of four 
colors (blue, green, hazel, and brown) or in terms of 
groups of colors. One method of grouping colors is 
light = blue + green and dark = hazel + brown, and 
this grouping would seem to more clearly distinguish 
individuals with respect to the detectable level of eumelanin 
(brown pigment). Given that our iris color data 
were self-reported, partitioning the sample into brown 
and not brown, or blue and not blue, could provide 



greater power to detect significant associations, particu- 
larly for alleles associated with blue or brown irises. 
To take advantage of each of these four methods, we 
considered all of them when screening SNPs for associations; 
we calculated the δ value, chi-square, and exact 
test P values for (a) all four colors, (b) shades, using 
light (blue and green) vs. dark (hazel and brown), (c) 
blue vs. brown, and (d) brown vs. not brown (blue, 
green, and hazel) groupings. We fixed significance lev- 
els at 5%, and the alleles of 20 SNPs were found to be 
associated with specific iris colors, 19 with iris color 
shades, 19 with blue/brown color comparisons, and 18 
using the brown/not brown comparison. The overlap 
among these SNP sets was high but not perfect. In all, 27 
SNPs were significantly associated with iris pigmentation 
using at least one of the four criteria, and we refer to 
these as "marginally" associated. When multiple simultaneous 
hypotheses are tested at set P values, there is the 
possibility of enhanced type I error, so we used the 
correction procedure of Steenland et al. (2000) with 
adjusted residuals to compensate for this risk. We found 
that most of the associations were still significant after 
this correction (those with asterisks in Table 2), and 
since the analysis was conducted using adjusted residuals, 
some new associations were observed (e.g., MAOA 
marker 2 had a chi-square P value of 0.24 but was associated 
using the corrected testing procedure; Table 2). 
Most of the marginally associated SNPs were found 
within the pigmentation genes OCA2 (n = 10), TYRP1 
(n = 4), AIM (n = 3), MYO5A (n = 2), and DCT (n = 

TABLE 2

SNPs marginally (independently) associated with iris pigmentation and SNPs associated only within the context of haplotypes and/or diplotypes

Gene | Haplotype order^a | Sequence | Frequency (minor)^b | HWE P^c | Colors P | Shade P | Brown/blue P | Brown/other P | Pigment history
AIM (MATP) 5p13.3 | | rs35391 | 0.03-T,C | ... | ... | ... | ... | ... |
 | | rs40132 | ... | ... | ... | ... | ... | ... |
 | | rs26722 | 0.02-T,C | ... | ... | ... | ... | ... |
ASIP 20q11.2-q12 | | rs2424987 | 0.12-G,A | ... | ... | ... | ... | ... | Brown irises, Kanetsky et al. (2002)
 | | rs2424984 | 0.13-C,T | 0.14 | 0.81 | 0.73 | ... | 0.75 | None
 | | rs819135 | ... | ... | ... | ... | ... | ... |
CYP1A2 15q22-ter | | AY392136 | 0.45-G,C | ... | ... | ... | ... | ... | None
CYP1B1 2p21 | | rs162560 | 0.24-T,C | ... | ... | 0.09* | ... | ... |
 | | rs1056837 | ... | ... | ... | ... | ... | ... | None
CYP2C8 10q23 | | rs1926705 | 0.32-C,T | ... | ... | ... | 0.04 | 0.07* | None
 | | rs1341164 | ... | 0.70 | ... | ... | ... | ... |
 | | AY392132 | 0.12-G,A | ... | ... | ... | ... | ... | None
CYP2C9 10q24 | | AY392131 | 0.22-C,T | ... | ... | 0.07* | ... | 0.05* | None
DCT 13q32 | 1 | rs1407995 | 0.18-T,C | ... | ... | ... | ... | ... | None
 | 2 | rs1325611 | 0.18-C,T | ... | ... | ... | ... | ... |
 | 3 | rs2892681 | ... | 0.42 | 0.37 | 0.44 | 0.56 | 0.75 | None
 | 4 | rs1028806 | 0.20-G,A | 0.94 | 0.45 | 0.37 | 0.66 | 0.40 | None
 | 5 | rs2031527 | 0.21-T,C | ... | ... | ... | ... | ... |
 | 6 | rs2296498 | 0.11-G,A | 0.37 | 0.88 | 0.51 | 0.66 | 0.91 | None
GSTT2 22q11.23 | | rs2140186 | ... | ... | ... | 0.37 | ... | ... | None
 | | ... | 0.49-G,A | ... | ... | ... | ... | ... | None
MAOA Xp11.4-11.3 | | rs979605 | 0.28-T,C | ... | 0.05* | ... | ... | ... | None
 | | ... | ... | ... | ... | ... | ... | ... |
MC1R 16q24.3 | | rs1805007 | 0.09-T,C | 0.14 | 0.18 | ... | ... | ... | Red hair/blue irises, multiple authors
 | | rs1805008 | 0.07-T,C | 0.37 | 0.69 | 0.25 | ... | 0.64 | Red hair/blue irises, multiple authors
 | | rs2228479 | 0.08-T,C | 0.01 | 0.84 | 0.58 | 0.94 | ... | Red hair/blue irises, multiple authors
MYO5A 15q21 | 1 | rs1724630 | 0.20-G,C | ... | ... | ... | ... | ... |
 | 2 | rs1724631 | 0.24-T,G | <0.01 | ... | ... | 0.55 | 0.41 | None
 | 3 | ... | ... | ... | ... | ... | ... | ... |
 | 4 | ... | 0.38-C,T | ... | ... | ... | ... | ... | None
 | 5 | ... | 0.38-T,C | ... | ... | ... | ... | ... |
 | 6 | ... | ... | ... | ... | ... | ... | ... | None
 | 7 | ... | ... | ... | ... | ... | ... | ... | None
 | 8 | rs2242057 | ... | ... | ... | ... | ... | ... | None
 | 9 | rs2899488 | 0.16-T,C | 0.71 | 0.89 | 0.69 | 0.99 | 0.78 |
 | 10 | rs1869126 | 0.16-T,C | 0.65 | 0.93 | 0.53 | 0.70 | 0.79 | None
OCA2 15q11.2-q12 | 1 | ... | ... | 0.34 | <0.01* | <0.01* | ... | <0.01* | None
 | 2 | AY392135 | ... | ... | <0.01* | ... | ... | <0.01* | None
 | 3 | ... | ...-G,A | ... | ... | ... | ... | ... | None
 | 4 | rs1900758 | 0.30-G,A | 0.81 | <0.01* | <0.01* | <0.01* | <0.01* |
 | 5 | rs1037208 | 0.18-C,A | 0.13 | <0.01* | <0.01* | <0.01* | <0.01* | None
 | 6 | rs1800411 | 0.27-C,T | 0.78 | <0.01* | <0.01* | <0.01* | 0.02* |
 | 7 | rs1800404 | 0.21-G,A | 0.95 | <0.01* | <0.01* | <0.01* | 0.02* |
 | 8 | AY392133 | 0.04-C,G | 0.65 | 0.02 | 0.07 | 0.01* | <0.01* | None
 | 9 | rs1800401 | 0.04-T,C | 0.63 | 0.03 | 0.02* | <0.01* | <0.01* | Brown irises, Rebbeck et al. (2003)
 | 10 | rs2044627 | 0.38-G,A | 0.19 | 0.05* | 0.29 | 0.02* | 0.03* |
 | 11 | rs1448483 | 0.04-G,A | <0.01 | 0.13 | 0.06 | 0.31 | 0.25 |
 | 12 | rs737051 | 0.29-G,A | 0.16 | 0.37 | 0.39 | 0.16 | 0.08 | None
 | 13 | rs1800410 | 0.11-G,A | 0.96 | 0.46 | 0.5 | 0.25 | 0.12 |
POMC 2p23 | | rs934778 | 0.30-C,T | ... | ... | ... | ... | ... | None
SILV 12q13-q14 | | ... | ... | ... | ... | ... | ... | ... | None
 | | rs1052165 | 0.29-T,C | 0.53 | 0.42 | 0.76 | 0.92 | 0.37 |
TYR 11q14-q21 | 1 | rs1827430 | 0.45-G,A | 0.41 | 0.06 | 0.37 | 0.52 | 0.65 | None
 | 2 | rs1042602 | 0.36-A,C | 0.06 | 0.64 | 0.55 | 0.72 | 0.45 |
 | 3 | rs1851992 | 0.42-G,A | 0.19 | 0.92 | 0.97 | 0.76 | 0.58 |
TYRP1 9p23 | 1 | rs2733832 | 0.41-C,T | 0.45 | 0.02* | <0.01* | <0.01* | <0.01* | None
 | 2 | rs2075508 | 0.06-... | 0.54 | 0.06 | <0.01* | 0.04 | 0.08 |
 | 3 | rs683 | 0.35-G,T | 0.19 | 0.01 | <0.01* | <0.01* | <0.01* | None
 | 4 | rs2762464 | 0.35-A,T | 0.85 | 0.03 | 0.04* | 0.01* | <0.01* | None
 | 5 | AY395737 | 0.06-T,C | 0.89 | 0.39 | 0.09 | 0.13 | 0.17 | None
 | 6 | AY395736 | 0.06-T,G | 0.92 | 0.62 | 0.20 | 0.24 | 0.43 | None

Asterisks represent P values that remained significant after the correction for multiple tests and P values in italic are those 
that were statistically significant (P ≤ 0.05). 
^a Haplotype order refers to the order of the SNPs in the haplotypes shown in Table 4 and described in the text. 
^b Frequency of the minor allele and the major and minor allele nucleotide. 
^c Hardy-Weinberg equilibrium P value, where a value <0.05 indicates that the alleles are not in equilibrium. 



2), although some associations were found within nonpigmentation 
genes such as CYP2C8 at 10q23, CYP2C9 
at 10q24, CYP1B1 at 2p21, and MAOA at Xp11.3. No 
significant SNP associations within the pigmentation 
genes SILV, MC1R, ASIP, POMC, RAB, or TYR were 
found, although TYR had one SNP with P = 0.06. The 
most strongly associated of the marginally associated 
SNPs were from the OCA2, TYRP1, and AIM genes, in 
order of the strength of association, which is the same 
order as that provided using the number of marginally 
associated SNPs, rather than their strength. 
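
The four color groupings used in the SNP screen, and the chi-square test applied to each, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function and dictionary names are hypothetical, alleles are tabulated one per sample for simplicity, the data are invented, and the exact test, the δ value, and the multiple-testing correction used in the study are not reproduced. The P value helper covers only the 1-degree-of-freedom case (a 2 x 2 table).

```python
from collections import Counter
from math import erfc, sqrt

# The four ways of grouping iris colors described in the text.
GROUPINGS = {
    "colors": lambda c: c,
    "shade": lambda c: "light" if c in ("blue", "green") else "dark",
    "blue_vs_brown": lambda c: c if c in ("blue", "brown") else None,
    "brown_vs_other": lambda c: "brown" if c == "brown" else "not brown",
}


def contingency(alleles, colors, grouping):
    """Count (allele, color group) pairs; samples mapped to None are dropped."""
    counts = Counter()
    for allele, color in zip(alleles, colors):
        group = GROUPINGS[grouping](color)
        if group is not None:
            counts[(allele, group)] += 1
    return counts


def chi_square(counts):
    """Pearson chi-square statistic and degrees of freedom for the table."""
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    total = sum(counts.values())
    row_tot = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_tot = {c: sum(counts[(r, c)] for r in rows) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / total
            stat += (counts[(r, c)] - expected) ** 2 / expected
    return stat, (len(rows) - 1) * (len(cols) - 1)


def p_value_df1(stat):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))


# Invented data: allele A is enriched among blue irises, G among brown.
alleles = ["A"] * 10 + ["G"] * 10
colors = ["blue"] * 8 + ["brown"] * 2 + ["blue"] * 2 + ["brown"] * 8
stat, df = chi_square(contingency(alleles, colors, "blue_vs_brown"))
print(stat, df, p_value_df1(stat))
```

Collapsing four colors into two groups reduces the degrees of freedom, which is one reason a SNP can reach significance under one grouping but not another.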

Since most of the SNPs identified from this approach 
localized to discrete genes or chromosomal regions, we 
grouped all of the SNPs from each locus and tested 
inferred haplotypes for association with iris colors using 
contingency analysis. We did not confine this higher- 
order analysis to those genes with marginal SNP associa- 
tions, but we grouped all of the high-frequency SNPs 
tested for each gene. For each gene, we inferred haplo- 
types and used contingency analyses to determine which 
haplotypes were statistically associated with iris colors. 
From the chi-square and adjusted residuals, we found 
43 haplotypes for 16 different loci to be either positively 
(agonist) or negatively (antagonist) associated with iris 
colors (Table 3). The strongest associations were ob- 
served for genes with SNPs that were marginally associ- 
ated (Table 2) and most of the genes with marginal SNP 
associations had haplotypes and diplotypes (sometimes 
referred to as multilocus gene-wise genotypes or diploid 
pairs of haplotypes) positively (agonist) or negatively 
(antagonist) associated with at least one iris color (Table 3). 
A few of the genes/regions not harboring a marginally 
associated SNP had haplotypes and diplotypes positively 
and/or negatively associated with iris colors (ASIP 



gene, 1 haplotype; MC1R gene, 2 haplotypes; Tables 2 
and 3). In other words, their SNPs were associated with 
iris colors only within the context of gene haplotypes 
or diplotypes. For some, associations with iris colors 
were found only within the context of diplotypes, but 
not at the level of the SNPs or the haplotype (i.e., SILV 
and GSTT2 genes located at 22q11.23). At the level of 
the haplotype, each gene or region had unique numbers 
and types of associations. For example, OCA2, AIM, DCT, 
and TYRP1 harbored haplotypes both positively associated 
with blue irises and negatively associated with 
brown irises (OCA2 haplotypes 1, 37, 38, 42; AIM haplotype 
1; DCT haplotype 2; and TYRP1 haplotype 1; Table 
3). Other genes such as AIM, OCA2, and TYRP1 harbored 
haplotypes positively associated with brown but 
negatively associated with blue color (AIM haplotype 2; 
OCA2 haplotypes 2, 4, 45, 47; TYRP1 haplotype 4; Table 
3), while others, such as MYO5A, OCA2, TYRP1, and 
CYP2C8 (located at 10q23), harbored haplotypes 
positively associated with one color but not negatively 
associated with any other color (MYO5A haplotype 5 
and haplotype 10, OCA2 haplotype 19, TYRP1 haplotype 
3, and CYP2C8 haplotype 1; Table 3). The MC1R gene 
harbored haplotypes associated only with green color 
in our sample and the POMC gene harbored a single 
SNP with genotypes weakly associated with iris colors 
(no significant haplotypes or diplotypes were found). 
Overall, the diversity of haplotypes associated with 
brown irises was similar to that of haplotypes associated 
with blue irises. Most of the haplotypes were even more 
dramatically associated with iris colors in a multiracial 
sample (data not shown), because many of the SNPs 
comprising them are good AIMs and variants associated 
with darker iris colors were enriched in those ancestral 
TABLE 3

The common haplotypes and diplotypes for the 16 iris color genes discussed in the text

Haplotypes of 16 genes^a
                      Agonist^b           Antagonist^b
ID    Sequence        P         Color     P         Color     Count^c

AIM (MATP) 5p13.3
1     CAC             0.0010    Blue      0.004     Brown     (1641)
                      0.048                                   (33)
3     TAC                                 -         -         (23)

ASIP 20q11.2-q12
1     ATA                                                     (979)
2     ATG                                                     (508)
3     GCG                                 -         -         (196)
4     ACA             0.017     Hazel     -         -         (13)

DCT 13q32
1     CTGACA          -         -                             (625)
2     CTCACA          0.014     Hazel     0.028     Blue      (242)
                      0.016     Blue      0.021     Brown
3     TCGACA          0.048     Green     0.003     Hazel     (281)
      CTCGTA                              -         -         (320)
5     CTGACG          -                   -         -         (179)

MC1R 16q24.3
1     TCC             0.016     Green                         (152)
2     CCC                                 0.026     Green     (1294)
3     CCT                                                     (143)
4     CTC                                                     (113)

MYO5A 15q21
1     CGATCGGCCC      0.000     Green               -         (51)
      CGATCAGCCC
4     GTGCTGATCC                          0.008     Blue      (163)
5     CGATCAACCC      0.017     Blue                          (40)
6     GTACTGATCC                                              (117)
8     CGACTGGTTT                                              (165)
10    CGACCAGCCC      0.003     Brown                         (40)
13    CTACTGGTTT                                              (71)
14    CGATTAGCCC                          0.027     Blue      (40)
16    CGGCCAATCC                                              (19)

OCA2 15q11.2-q12
1     GGGGACGGCAAAG   0.002     Blue      0.01      Brown     (44)
2     GAGGCCGGCAAGA   0.018     Hazel     0.031     Blue      (43)
                      0.022     Brown
3     GAGGCCAGCAAGA   0.042     Green     0.026     Blue      (21)
4     GAGGCCGGCAAAA   0.024     Brown     0.014     Blue      (89)
7     GAGGCCAGCAAAA                       -         -         (13)
15    TGGGACGCTAAAG                       -         -         (17)
19    GAGGCCAGCGAGA   0.019     Brown     -         -         (23)
22    GAGGCCGGCGAAA   0.036               0.012     Blue      (15)
25    TAAGCCAGCGAAA                       -         -         (13)
37    GGAAATAGCAAAA   <0.001    Blue      <0.001    Brown     (508)
38    GGAAATAGCGAAA   0.025     Blue      0.011     Brown     (200)
39    GGAAATAGCAAGA                       0.039               (71)
41    GGAAATAGCAGAA                                           (19)
42    GGAAATAGCGAGA   0.029     Blue      0.021               (174)
45    TGAAATAGCGAAA   <0.001              0.007     Blue      (65)
47    TGAAATAGCGAGA   0.003     Brown     0.006     Hazel     (22)
                                          0.036     Blue
48    TGAAATAGCAAAA   0.001     Brown     0.015     Green     (35)
57    TGAAATAGCAAGA                                           (30)

(continued)
TABLE 3 (Continued)

Haplotypes of 16 genes^a; Agonist^b chi-square P and color; Antagonist^b chi-square P and color; Count^c
(columns are listed in order below; row alignment is not recoverable)

Loci, with the agonist entries printed beneath each:
SILV 12q13-q14
TYR 11q14-q21
CYP1A2-15q22-ter: 0.015 Brown; 0.023 Hazel
CYP1B1-2p21: 0.027 Hazel
CYP2C8-10q23: Brown
CYP2C9-10q24: Brown; Green
GSTT2-22q11.23
MAOA-Xp11.4-11.3: 0.006 Blue

Sequences:
CC, TT, TC/TC
GCG, AAG, GCA, AAA, ACA, ACG, GAG, GAA
TTTTCG, CTTTTT, CTGACG, CCGACG, TTGTCG, CTTACG
CC, TT, CT
CAA, TAG, TGA, TAA
AG, GA, AA, AG/AG, AG/GA

Antagonist chi-square P: 0.035, 0.01; 0.023, <0.001
Antagonist color: Blue, Green; Blue, Brown

Counts:
(1187), (515)
(913), (296), (490)
(184), (257), (467), (231), (254), (189), (83), (37)
(454), (98), (23), (24)
(769), (933)
(957), (403), (337)
(539), (201), (513), (439)
(1325), (377)
(821), (778), (99), (184), (401)
(1133), (480), (87)

^a Sequences of the highest order of complexity within a locus found to be associated with iris colors. All of the major sequences
(count ≥ 13) for each locus with at least one significantly associated sequence are shown. If no haplotypes or diplotypes for a
locus were found to be associated, only the SNP alleles are shown. If no haplotypes were found to be associated for a locus but
diplotypes were found to be associated, both the haplotypes and the diplotypes are shown.

^b Agonist color refers to the color with which the sequence is positively associated. Antagonist color refers to the color with
which the sequence is negatively associated. Chi-square P value is shown.

^c Number of times the haplotype was observed in our sample of 851.
groups of the world that are of darker average iris color
(Frudakis et al. 2003; data not shown). Most of the
SNPs within a gene or region were in LD with others
in that gene or region (|D′| ≥ 0.05); only 32 SNP
pairs, in the MC1R (1 pair), OCA2 (27 pairs), TYR (2
pairs), and TYRP1 (2 pairs) genes, were found to be
in linkage equilibrium (not shown).
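
The contingency analysis described above (a chi-square test followed by inspection of adjusted residuals) can be sketched in Python as follows. This is a minimal illustration only: the counts are hypothetical and are not taken from the study data, the function names are invented for the example, and Haberman's adjusted residual is assumed as the residual definition, which the text does not specify.

```python
import math


def adjusted_residuals(table):
    """Return (chi_square, residuals) for a 2-D contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
            # Adjusted residual (assumed definition): the raw residual divided
            # by its estimated standard error under independence.
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            res_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(res_row)
    return chi_square, residuals


def two_sided_p(z):
    """Two-sided normal-approximation P value for a single adjusted residual."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts. Rows: haplotype present, haplotype absent.
# Columns: blue, green, hazel, brown.
colors = ["blue", "green", "hazel", "brown"]
counts = [[30, 5, 5, 4], [600, 200, 300, 558]]

chi_square, residuals = adjusted_residuals(counts)
print(f"chi-square = {chi_square:.2f}")
for color, z in zip(colors, residuals[0]):
    print(f"{color}: adjusted residual = {z:+.2f}, P = {two_sided_p(z):.4f}")
```

Under this reading, a significantly positive residual for a color would correspond to an agonist association and a significantly negative residual to an antagonist association.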

These analyses resulted in the identification of 61 
SNPs in 16 genes/chromosomal regions associated with 
iris colors on one level or another; details for each and 
whether the SNP is marginally associated or associated 
within the context of the haplotype and/or diplotype 
are shown in Table 2. The minor allele frequency for
most of these SNPs was relatively high (average minor
allele frequency = 0.22) and most of them were in Hardy-Weinberg
equilibrium (HWE; those for which HWE P > 0.05, 28/34;
Table 3). Nine were not and of these 2 were of
relatively low frequency with weak evidence for disequilibrium
(P value close to 0.05). Lack of HWE is usually
an indication of a poorly designed genotyping assay, 
but none of the remaining 7 SNPs exhibited genotyping 
patterns that we have previously associated with such 
problems (such as the complete absence of an expected 
genotype class or all genotypes registering as heterozy- 
gotes). Indeed, one of those for which the evidence of 
lack of HWE was the strongest was validated as a legiti- 
mate SNP through direct DNA sequencing (data not 
shown). The chromosomal distribution of the SNPs that
were significantly associated in a marginal sense was
found to be independent of the distribution of SNPs 
actually surveyed, indicating that the associations were 
not merely a function of SNP sampling and the same 
was true for the distribution of all the SNPs shown in 
Table 2 (data not shown). Chromosome 15q harbored 
the majority (14/27) of the SNPs that were marginally 
associated with iris colors, and all but one of these 14 
were found in two different genes: OCA2 and MYO5A
(Table 2). Chromosome 5p had 3 SNPs marginally associated,
all in the AIM gene, and chromosome 9p had 5
SNPs associated, all in the TYRP1 gene. Multiple SNPs
were identified on chromosome 10q; the CYP2C8-10q23
region had 1 marginally associated SNP, and the neighboring
region, CYP2C9-10q24, also had one. As one
might expect from the proximity of these two regions,
CYP2C8-CYP2C9 marker pairs were found to be in tight
LD with one another (P < 0.001 for each possible pair).
Multiple SNPs were also identified on chromosome 2;
the C/C genotype for the POMC SNP located at 2p23
was associated with blue iris color (Table 3) and a
CYP1B1-2p21-region SNP was also marginally associated
at the level of iris shade (Table 2), as well as within
the context of a 2-SNP haplotype (Table 3). The SNPs
between the 2p21 and 2p23 regions were also in LD
(P < 0.01). Finally, in addition to the OCA2 (15q11.2-q12)
and MYO5A (15q21) sequences, a single SNP
(15q22-ter) was also implicated on chromosome 15q,
but SNPs between each of these three loci were not
found to be in LD (data not shown). SNPs for the MC1R
(16q24), SILV (12q13), and TYR (11q) genes and for
the MAOA-Xp11.4-11.3 and GSTT2-22q11.23 regions
were also found to be associated at the level of the
haplotype (Tables 3 and 4), although these were the
only regions of these chromosomes for which associations
were found.
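
The Hardy-Weinberg check mentioned above can be illustrated with a short Python sketch. It assumes a standard 1-d.f. chi-square goodness-of-fit test on genotype counts for a biallelic SNP; the text does not state which HWE test was used, and the function name and genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 d.f.) for Hardy-Weinberg equilibrium.

    Arguments are the observed counts of the two homozygotes (n_aa, n_bb)
    and of the heterozygote (n_ab) for a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("monomorphic locus: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical genotype counts, for illustration only.
print(hwe_chi_square(420, 350, 81))   # close to HWE proportions
print(hwe_chi_square(0, 100, 0))      # every genotype called heterozygous
```

The second call mimics the assay-failure pattern described in the text (all genotypes registering as heterozygotes), which produces a very large departure from HWE.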

The P values we obtained suggested that diplotypes 
explained more iris color variation than did haplotypes 
or individual SNPs. To test this, we performed a cor- 
rected ANOVA analysis for our data on each of these 
three levels. We considered all 61 SNPs in Table 2, their 
haplotypes in Table 3, and their diplotypes (not shown).
Diplotypes explained 15% of the variation, whereas hap- 
lotypes explained 13% and SNPs explained only 11% 
(Table 4) after correcting for the number of variables. 
The most strongly associated 68 genotypes of the 543
genotypes observed for the 16 genes/regions, on the
basis of chi-square-adjusted residuals, explained 13% of
the variation (last row in Table 4).
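
The correction for the number of variables can be illustrated with the standard adjusted R² formula. The sketch below is an assumption rather than a statement of the exact correction applied: it takes the Table 4 values 0.17, 0.34, 0.69, and 0.2 to be unadjusted R², and uses the model and error degrees of freedom listed there. Function names are illustrative.

```python
def adjusted_r2(r2, model_df, error_df):
    """Standard adjusted R^2, penalizing R^2 for the number of model terms."""
    total_df = model_df + error_df
    return 1 - (1 - r2) * total_df / error_df


def f_statistic(r2, model_df, error_df):
    """Overall ANOVA F statistic implied by R^2 and the degrees of freedom."""
    return (r2 / model_df) / ((1 - r2) / error_df)


# (label, model d.f., error d.f., R^2 as read from Table 4; assumed unadjusted)
rows = [
    ("SNP data", 62, 788, 0.17),
    ("Haplotype data", 212, 638, 0.34),
    ("Multilocus genotype data", 543, 307, 0.69),
    ("Multilocus genotype data, top 68", 68, 782, 0.20),
]

for label, model_df, error_df, r2 in rows:
    adj = adjusted_r2(r2, model_df, error_df)
    f_val = f_statistic(r2, model_df, error_df)
    print(f"{label}: adjusted R^2 = {adj:.3f}, F = {f_val:.2f}")
```

Because the tabulated R² values are rounded to two decimals, the recomputed figures agree with Table 4 only approximately.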

DISCUSSION 

From a screen of 754 SNP loci, we have identified 61 that 
are statistically associated with variable iris pigmentation 
at one level of intragenic complexity or another. The 
remaining SNPs had δ values and chi-square P values that
were not significant on any level of intragenic complexity.
Diplotypes for these 61 alleles explained most of the
iris color variance in our sample; the lowest amount was 
explained at the level of the SNP, suggesting an element 
of intragenic complexity to iris color determination (i.e.,
dominance). Only about half of the 61 SNPs that we 
identified were associated with iris colors indepen- 
dently — the others were associated only in the context 
of haplotypes or diplotypes. Even at this level of com- 
plexity, the sequences from no single gene could be 
used to make reliable iris color inferences, which sug- 
gests an element of intergenic complexity (i.e., epistasis) 
for iris color determination as well. Aside from the fact 
that many of the SNPs we identified were significant 
after imposing the Steenland correction for multiple 
testing, there are three lines of evidence that the SNPs 
we have identified are not spuriously associated. The 
first is that for most of the genes for which we identified 
marginally associated SNPs, multiple such SNPs were 
identified. In other words, the distribution of SNPs 
among the various genes tested was not random. In- 
deed, the associations were observed to be generally 
stronger for the SNPs in the context of within-gene 
haplotypes — a result that would not necessarily be ob- 
tained for individual SNPs spuriously associated — 
suggesting that the gene sequences themselves are asso- 
ciated, not merely a spurious polymorphism within each 
gene. Second, although a roughly equal number of pig- 
mentation and nonpigmentation gene SNPs were 
tested, of the 34 marginally associated SNPs, 28 of them 
TABLE 4

ANOVA-SNP and haplotype data

Source                              Model d.f.  Error d.f.  No. of variables  F value  R²    Adjusted R²
SNP data                            62          788         62                2.63     0.17  0.11
Haplotype data                      212         638         216               1.58     0.34  0.13
Multilocus gene-wise genotype data  543         307         572               1.27     0.69  0.15
Multilocus gene-wise genotype data  68          782         68                2.82     0.2   0.13

(82%) were in pigmentation genes. In other words, the 
distribution of SNPs among the various gene types was 
also not random. Third, when applied to a sample in- 
cluding individuals of multiple ancestries, the linear 
and nonlinear variables from these and the other genes 
combined performed even better than when applied 
just to individuals of majority European ancestry (not 
shown). Since most individuals of non-European or minority
European descent exhibit low variability in iris
colors (on average of darker shade than individuals of 
European descent), this improvement may not seem 
surprising. However, this result would not have necessar- 
ily been obtained were we working with SNPs that were 
not truly associated with iris colors. Although correc- 
tions for multiple testing left most of the SNP-level asso- 
ciations intact, a number of the associations we found 
did not pass the multiple-testing examination, but none- 
theless we present them here to avoid possible type II 
error; the sequences may be weakly associated with iris 
colors and possibly relevant within a multiple-gene 
model for classification (i.e., epistasis). For these, it 
would seem more prudent to eliminate false positives 
downstream of SNP identification, such as from tests of 
higher-order association, using various other criteria, 
such as those described above, or possibly using the 
utility of the SNP for the generalization of a complex 
classification model when one is finally described. 
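
The second line of evidence above (28 of the 34 marginally associated SNPs falling in pigmentation genes) can be given a rough worked check. The sketch assumes a simple one-sided binomial model in which, under the null hypothesis, each associated SNP is a pigmentation-gene SNP with probability equal to the share of pigmentation-gene SNPs among those surveyed (335 of 754); it also assumes that SNPs are independent, which LD within genes violates, so the result is indicative only.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Surveyed SNPs: 335 in pigmentation genes and 419 elsewhere.
p_null = 335 / (335 + 419)

# Observed: 28 of the 34 marginally associated SNPs were in pigmentation genes.
p_value = binomial_upper_tail(28, 34, p_null)
print(f"null proportion = {p_null:.3f}, one-sided P = {p_value:.2e}")
```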

Mutations in the pigmentation genes are the primary 
cause of oculocutaneous albinism so it was natural to 
expect that common variations in their sequences might 
explain some of the variance in natural iris colors, and 
this is in fact what we observed. However, a number of 
the associations we identified were for SNPs located in 
other types of genes. The sequences for most of these 
genes vary significantly as a function of population struc- 
ture (Frudakis et al. 2003) and it is possible that alleles 
for these SNPs are associated with elements of popula- 
tion structure that correlate with iris colors. Alterna- 
tively, the mechanism for the associations could be LD 
with phenotypically active loci in nearby pigment genes. 
Indeed, some, but not all, of our nonpigment gene SNPs 
are found in regions within the vicinity of pigmentation 
genes; CYP2C8 and CYP2C9 are located on chromosome
10 near the HPS1 and HPS2 pigmentation genes (which
we did not test directly), CYP1A2 is located at 15q22-ter
on the same arm as OCA2 and MYO5A, CYP1B1 is located
at 2p21 in the vicinity of the POMC gene at 2p23, and
MAOA is located on the same arm of chromosome X
(Xp11.4-11.3) as the OA1 pigmentation gene (which
we also did not test directly). The distances between
these loci associated with iris colors and "neighboring"
pigmentation genes are far greater than the average extent
of LD in the genome, and if it is the case that these
associations are through LD, it would seem that, again, 
population structure would need to be invoked as an 
explanation. The structure behind our results is unlikely 
to be of a crude (i.e., continental) nature; although 
two-thirds of our European-American samples were of 
significant (4%) BGA admixture, few correlations be- 
tween structure measured on this level and iris colors 
were observed in this study. Rather, it seems likely that 
the structure behind our results is of a finer, more "cryp- 
tic" nature, such as ethnicity or even within-ethnic-group 
structure. To an investigator interested in elucidating 
a biological mechanism, association due to population 
structure might not seem to be very satisfying, but when 
classification is the goal rather than the elucidation of 
a biological mechanism, it would seem to matter little 
why a marker is associated with a trait. For example, 
forensics investigators construct physical profiles using 
surprisingly unscientific means; only in rare cases are 
eye-witness accounts available, and in certain circum- 
stances these accounts are subjective and unreliable. A 
battery of genetic tests, of which one for the inference 
of iris color could be a part, could enable the construc- 
tion of a more objective and science-based (partial) 
physical profile from crime-scene DNA, and an investi- 
gator using these tests would be less interested in the 
biological mechanism of the phenotype than in an abil- 
ity to make an accurate inference of trait value. Of 
course, identifying markers in LD with phenotypically 
active loci (or the phenotypically active loci themselves) 
would provide for more accurate classification (as well 
as for a better understanding of biological mechanism),
but the hunt for these elusive loci in heterogeneous 
populations is still impractical because LD extends only 
for a few kilobases and the economics of genome-wide 
scans in heterogeneous samples with full LD coverage 
are out of reach for most labs. 

Linkage studies have implicated certain pigmentation 
genes as specifically relevant for pigmentation phenotypes,
and most of the pigmentation gene SNPs that
we identified clustered to certain genes such as OCA2,
MYO5A, TYRP1, and AIM. Further, certain of our results
support the previous literature. Most of the SNPs that 
we identified were on chromosome 15, which Eiberg 
and Mohr (1996) described from linkage analyses as the 
primary chromosome for the determination of "brown- 
ness." As suggested by these authors, the candidate gene 
within the interval containing this locus (BEY2) is most 
likely the OCA2 gene, although the MYO5A gene is also
present within this interval and, as shown here, associated
with iris colors. OCA2 associations were by far the
most significant of any gene or region we tested, while
MYO5A SNPs were only weakly associated (but haplotypes
and diplotypes more strongly). MYO5A alleles
were not found to be in LD with those of OCA2, sug- 
gesting that these results were independently obtained 
and that Eiberg and Mohr's results may have been a 
reflection of the activity of two separate genes. Rebbeck 
et al. (2002) recently described two OCA2 coding changes 
associated with darker iris colors. One of these, the 
Arg305Trp SNP, was one of the 13 OCA2 SNPs that we
found to be strongly associated with iris colors using all 
four of our color criteria, although its association was 
only the ninth strongest among the OCA2 SNPs that we 
identified and the eleventh strongest among all of the 
associated SNPs that we identified. The P values we 
obtained for this particular SNP association (P = 0.01- 
0.05, depending on the color criteria) were less signifi- 
cant than those described (P = 0.002) by Rebbeck et 
al. (2002). In addition, we independently isolated the 
"red hair/blue iris" SNP alleles described by Valverde 
et al. (1995) and Koppula et al. (1997), suggesting that 
these sequences are indeed associated with iris pigmen- 
tation as suggested by these authors, although we note 
that the associations described by these authors were 
with blue irises and at the level of the SNP, while those 
that we observed were with green irises and apparent 
only at the level of the haplotypes and diplotypes. We 
also identified associations in the ASIP gene, which sup- 
ports previous work by Kanetsky et al. (2002), although
it should be noted that we did not observe this gene 
association at the level of the SNP as they did; one of 
the ASIP SNPs that we identified (marker 861, Table 2) 
is the 8818 G-A SNP transversion that they described to 
be associated with brown iris colors, but from our study 
the association was with hazel color at the level of the 
haplotype. Last, we also showed that the associations 
between TYR haplotypes and iris colors were relatively 
weak, which is not inconsistent with results obtained by 
many others before us working in the field of oculocuta- 
neous albinism who have failed to find strong associa- 
tions in smaller samples. Although our results independently
verified findings for OCA2, ASIP, and MC1R, they
also show that several other pigmentation genes harbor
alleles associated with the natural distribution of iris
colors (TYRP1, AIM, MYO5A, and DCT). Therefore, our
findings indicate that most of the previous
results associating pigmentation gene alleles with iris
colors, taken independently, represent merely strokes of
a larger, more complex portrait. It is interesting that
most of the SNPs that we discovered are noncoding, 
either silent polymorphisms or SNPs residing in the 
gene proximal promoter, intron, or 3' UTR, which is 
not altogether unusual. Although this could indicate 
that the SNPs are in LD with other phenotypically active 
loci, it may also be a reflection that variability in message 
transcription and/or turnover may explain part of the
variability observed in human iris colors. Although we 
screened a large number of SNPs, some of the genes 
harbor a large number of candidate SNPs and we did 
not test them all. For example, the OCA2 gene has ~200
known candidate SNPs in NCBI's dbSNP, and it is possi- 
ble that this gene has more to teach us about variable 
human iris pigmentation than what we have learned 
from the work presented herein. 

Clearly work remains to be done, objectifying the 
collection of iris colors from subjects, enhancing the 
sample size so that epistatic interactions can be ex- 
plored, possibly screening other regions of the genome 
not screened here, and modeling the sequences that 
we have described to enable classification of iris colors 
from DNA. However, the results presented herein con- 
stitute a good first step toward solving what our results 
confirm is a very complex genetics problem. When this 
work is more fully developed, it may be possible to assign 
an iris color to an individual sample with reasonable 
certainty, and surely in this case the results herein will 
have some tangible value for the field of forensic sci- 
ence. Alternatively, as a research tool, the common hap- 
lotypes that we have identified and the complex, biologi- 
cally relevant contexts within which they are found may 
help researchers more accurately define risk factors for 
pigmentation-related diseases such as cataracts and mel- 
anoma. 